Linear convergence of accelerated conditional gradient algorithms in spaces of measures
A class of generalized conditional gradient algorithms for the solution of
optimization problems in spaces of Radon measures is presented. The method
iteratively inserts additional Dirac-delta functions and optimizes the
corresponding coefficients. Under general assumptions, a sub-linear
rate in the objective functional is obtained, which is sharp
in most cases. To improve efficiency, one can fully resolve the
finite-dimensional subproblems occurring in each iteration of the method. We
provide an analysis for the resulting procedure: under a structural assumption
on the optimal solution, a linear convergence rate is
obtained locally. Comment: 30 pages, 7 figures
Nonconvex penalization for sparse neural networks
Training methods for artificial neural networks often rely on
over-parameterization and random initialization in order to avoid spurious
local minima of the loss function that fail to fit the data properly. To
sidestep this, one can employ convex neural networks, which combine a convex
interpretation of the loss term, sparsity-promoting penalization of the outer
weights, and greedy neuron insertion. However, the canonical penalty
does not achieve a sufficient reduction in the number of nodes in a shallow
network in the presence of large amounts of data, as observed in practice and
supported by our theory. As a remedy, we propose a nonconvex penalization
method for the outer weights that maintains the advantages of the convex
approach. We investigate the analytic aspects of the method in the context of
neural network integral representations and prove attainability of minimizers,
together with a finite support property and approximation guarantees.
Additionally, we describe how to numerically solve the minimization problem
with an adaptive algorithm combining local gradient-based training and
adaptive node insertion and extraction.
Towards optimal sensor placement for inverse problems in spaces of measures
This paper studies the identification of a linear combination of point
sources from a finite number of measurements. Since the data are typically
contaminated by Gaussian noise, a statistical framework for its recovery is
considered. It relies on two main ingredients: first, a convex but non-smooth
Tikhonov point estimator over the space of Radon measures and, second, a
suitable mean-squared error based on its Hellinger-Kantorovich distance to the
ground truth. Assuming standard non-degenerate source conditions as well as
applying careful linearization arguments, a computable upper bound on the
latter is derived. On the one hand, this makes it possible to derive asymptotic
convergence results for the mean-squared error of the estimator in the
small variance case. On the other, it paves the way for applying optimal sensor
placement approaches to sparse inverse problems. Comment: 31 pages, 8 figures
The Protein Model Portal
Structural Genomics has been successful in determining the structures of many unique proteins in a high-throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionarily related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploration of the protein structure space. One of the challenges in using model information effectively has been to access all models available for a specific protein in heterogeneous formats at different sites using various incompatible accession code systems. Often, structure models for hundreds of proteins can be derived from a given experimentally determined structure, using a variety of established methods. This has been done by all of the PSI centers, and by various independent modeling groups. The goal of the Protein Model Portal (PMP) is to provide a single portal which gives access to the various models that can be leveraged from PSI targets and other experimental protein structures. A single interface allows all existing pre-computed models across these various sites to be queried simultaneously, and provides links to interactive services for template selection, target-template alignment, model building, and quality assessment. The current release of the portal consists of 7.6 million model structures provided by different partner resources (CSMP, JCSG, MCSG, NESG, NYSGXRC, JCMM, ModBase, SWISS-MODEL Repository). The PMP is available at http://www.proteinmodelportal.org and from the PSI Structural Genomics Knowledgebase.
The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods
The Protein Structure Initiative’s Structural Biology Knowledgebase (SBKB, URL: http://sbkb.org) is an open web resource designed to turn the products of the structural genomics and structural biology efforts into knowledge that can be used by the biological community to understand living systems and disease. Here we will present examples of how to use the SBKB to enable biological research. For example, a protein sequence or Protein Data Bank (PDB) structure ID search will provide a list of related protein structures in the PDB, associated biological descriptions (annotations), homology models, structural genomics protein target status, experimental protocols, and the ability to order available DNA clones from the PSI:Biology-Materials Repository. A text search will find publication and technology reports resulting from the PSI’s high-throughput research efforts. Web tools that aid in research, including a system that accepts protein structure requests from the community, will also be described. Created in collaboration with the Nature Publishing Group, the Structural Biology Knowledgebase monthly update also provides a research library, editorials about new research advances, news, and an events calendar to present a broader view of structural genomics and structural biology.